A method of supervised learning from conflicting data with hidden contexts

Zhang, Tianren, Jiang, Yizhou, Chen, Feng

arXiv.org Artificial Intelligence

Conventional supervised learning assumes a stable input-output relationship. However, this assumption fails in open-ended training settings where the input-output relationship depends on hidden contexts. In this work, we formulate a more general supervised learning problem in which training data is drawn from multiple unobservable domains, each potentially exhibiting a distinct input-output map. This inherent conflict in the data renders standard empirical risk minimization ineffective. To address this challenge, we propose LEAF, a method that introduces an allocation function, which learns to assign conflicting data to different predictive models. We establish a connection between LEAF and a variant of the Expectation-Maximization algorithm, allowing us to derive an analytical expression for the allocation function. Finally, we provide a theoretical analysis of LEAF and empirically validate its effectiveness on both synthetic and real-world tasks involving conflicting data.
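
The allocation idea can be illustrated with a hard-assignment, EM-style loop. The sketch below is not the paper's implementation: the two linear predictive models, the squared-error allocation rule, and the alternating refit are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Conflicting data: two hidden contexts map the same inputs to
# different outputs (y = 2x in one domain, y = -x in the other).
x = rng.uniform(-1.0, 1.0, size=400)
ctx = rng.integers(0, 2, size=400)
y = np.where(ctx == 0, 2.0 * x, -x) + 0.05 * rng.standard_normal(400)

K = 2                                  # number of predictive models
w = rng.standard_normal(K)             # model k predicts y_hat = w[k] * x

for _ in range(20):
    # Allocation step (E-step stand-in): send each sample to the model
    # that currently explains it best.
    errs = (y[None, :] - w[:, None] * x[None, :]) ** 2   # shape (K, N)
    assign = errs.argmin(axis=0)
    # Refit step (M-step stand-in): least squares per allocated subset.
    for k in range(K):
        m = assign == k
        if m.any():
            w[k] = (x[m] @ y[m]) / (x[m] @ x[m])

print("recovered slopes:", np.sort(w))  # ~[-1.0, 2.0]
```

With slopes 2 and -1 mixed in the same dataset, a single least-squares fit would average the two maps away, while the alternating allocation recovers both.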


Efficient Neural Hybrid System Learning and Transition System Abstraction for Dynamical Systems

Yang, Yejiang, Mo, Zihao, Xiang, Weiming

arXiv.org Artificial Intelligence

This paper proposes a neural network hybrid modeling framework for dynamics learning that promotes interpretable, computationally efficient dynamics learning and system identification. First, a low-level model is trained to learn the system dynamics, using multiple simple neural networks to approximate the local dynamics on data-driven partitions. Then, based on the low-level model, a high-level model is trained that abstracts the low-level neural hybrid system model into a transition system, enabling Computation Tree Logic (CTL) verification to improve the model's support for human interaction and its verification efficiency. Keywords: Hybrid and Distributed System Modeling; Neural Networks; Nonlinear System Modeling; Maximum-Entropy Partitioning; Model Abstraction.
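
As a rough illustration of the two levels, the sketch below uses 1-D k-means as a stand-in for the paper's maximum-entropy partitioning, local affine models in place of the simple neural networks, and builds the abstract transition relation by pushing data through the learned dynamics; all names and parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Trajectory pairs (x_t, x_{t+1}) from an unknown 1-D system.
f = lambda x: 0.9 * x + 0.3 * np.sin(3.0 * x)
xs = rng.uniform(-2.0, 2.0, size=2000)
ys = f(xs)

# Low level: partition the state space (1-D k-means stands in for
# maximum-entropy partitioning), one local affine model per cell.
K = 4
centers = np.linspace(-2.0, 2.0, K)
for _ in range(10):
    cell = np.abs(xs[:, None] - centers[None, :]).argmin(axis=1)
    centers = np.array([xs[cell == k].mean() for k in range(K)])

local = []                             # (slope, intercept) per cell
for k in range(K):
    A = np.stack([xs[cell == k], np.ones((cell == k).sum())], axis=1)
    local.append(np.linalg.lstsq(A, ys[cell == k], rcond=None)[0])

def step_hat(x):
    """Predict x_{t+1} with the local model of x's cell."""
    a, b = local[np.abs(x - centers).argmin()]
    return a * x + b

# High level: abstract into a transition system over the cells.
trans = {(np.abs(x0 - centers).argmin(),
          np.abs(step_hat(x0) - centers).argmin()) for x0 in xs}
print("abstract transitions:", sorted(trans))
```

CTL formulas (e.g., "every trajectory eventually reaches cell 0") could then be checked on the small transition relation rather than on the learned dynamics directly.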


InterpBench: Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques

Gupta, Rohan, Arcuschin, Iván, Kwa, Thomas, Garriga-Alonso, Adrià

arXiv.org Artificial Intelligence

Mechanistic interpretability methods aim to identify the algorithm a neural network implements, but it is difficult to validate such methods when the true algorithm is unknown. This work presents InterpBench, a collection of semi-synthetic yet realistic transformers with known circuits for evaluating these techniques. We train these neural networks using a stricter version of Interchange Intervention Training (IIT) which we call Strict IIT (SIIT). Like the original, SIIT trains neural networks by aligning their internal computation with a desired high-level causal model, but it also prevents non-circuit nodes from affecting the model's output. We evaluate SIIT on sparse transformers produced by the Tracr tool and find that SIIT models maintain Tracr's original circuit while being more realistic. SIIT can also train transformers with larger circuits, like Indirect Object Identification (IOI). Finally, we use our benchmark to evaluate existing circuit discovery techniques.
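
The two training signals can be sketched on a toy network. Below, hidden unit 0 is (by assumption) aligned with a high-level variable S = a + b; the IIT term asks an interchange intervention on the aligned unit to reproduce the high-level model's behavior, and the SIIT term additionally penalizes any effect of patching a non-circuit unit. This is a hand-rolled illustration, not InterpBench's training code.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy low-level model: a linear 2-layer net with 3 hidden units.
# By assumption, hidden unit 0 is aligned with the high-level
# variable S = a + b; units 1 and 2 are non-circuit nodes.
W1 = rng.standard_normal((3, 2))
W2 = rng.standard_normal(3)

def forward(inp, patch=None):
    h = W1 @ inp
    if patch is not None:            # interchange intervention: overwrite
        idx, val = patch             # one hidden node with its value from
        h = h.copy(); h[idx] = val   # a run on a different (source) input
    return W2 @ h

def high_level(a, b):                # desired high-level causal model
    return a + b

base = np.array([1.0, 2.0])
source = np.array([4.0, -1.0])
h_source = W1 @ source

# IIT term: patching the aligned unit should make the model behave as
# the high-level model does when S is taken from the source input.
iit_loss = (forward(base, patch=(0, h_source[0])) - high_level(*source)) ** 2

# SIIT's extra term: patching a NON-circuit unit must leave the output
# unchanged, so non-circuit nodes cannot carry the computation.
siit_penalty = (forward(base, patch=(1, h_source[1])) - forward(base)) ** 2

print(float(iit_loss), float(siit_penalty))  # both driven toward 0 by training
```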


Targeted Reduction of Causal Models

Kekić, Armin, Schölkopf, Bernhard, Besserve, Michel

arXiv.org Machine Learning

Why does a phenomenon occur? Addressing this question is central to most scientific inquiries based on empirical observations, and often heavily relies on simulations of scientific models. As models become more intricate, deciphering the causes behind these phenomena in high-dimensional spaces of interconnected variables becomes increasingly challenging. Causal machine learning may assist scientists in the discovery of relevant and interpretable patterns of causation in simulations. We introduce Targeted Causal Reduction (TCR), a method for turning complex models into a concise set of causal factors that explain a specific target phenomenon. We derive an information theoretic objective to learn TCR from interventional data or simulations and propose algorithms to optimize this objective efficiently. TCR's ability to generate interpretable high-level explanations from complex models is demonstrated on toy and mechanical systems, illustrating its potential to assist scientists in the study of complex phenomena in a broad range of disciplines.
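
In the linear case the idea reduces to something very simple. The sketch below learns a one-dimensional reduction z = w·x from interventional data, with a squared-error consistency objective standing in for the paper's information-theoretic one; the simulator and the shift-style interventions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

d = 20
c = rng.standard_normal(d)             # ground-truth simulator weights

def simulate(shift):
    """High-dimensional simulator under a soft intervention `shift`."""
    x = rng.standard_normal(d) + shift
    y = c @ x + 0.1 * rng.standard_normal()   # scalar target phenomenon
    return x, y

# Interventional data: random shifts play the role of interventions.
data = [simulate(rng.standard_normal(d)) for _ in range(500)]
X = np.array([x for x, _ in data])
Y = np.array([y for _, y in data])

# Learn the reduction z = w @ x jointly with the (here trivial) reduced
# model y_hat = z; under squared error this collapses to least squares,
# a stand-in for the paper's information-theoretic objective.
w, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The 20 interacting variables are compressed into one causal factor
# that still predicts the target under a new intervention.
x_new, y_new = simulate(np.ones(d))
print("reduced-model prediction:", w @ x_new, "observed target:", y_new)
```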


EnsembleFollower: A Hybrid Car-Following Framework Based On Reinforcement Learning and Hierarchical Planning

Han, Xu, Chen, Xianda, Zhu, Meixin, Cai, Pinlong, Zhou, Jianshan, Chu, Xiaowen

arXiv.org Artificial Intelligence

Car-following models have made significant contributions to our understanding of longitudinal driving behavior. However, they often exhibit limited accuracy and flexibility: they cannot fully capture the complexity inherent in car-following processes, and they may falter in unseen scenarios because they rely on the limited driving behaviors present in their training data. Notably, each car-following model has its own strengths and weaknesses depending on the driving scenario. We therefore propose EnsembleFollower, a hierarchical planning framework for advanced human-like car-following. The framework involves a high-level Reinforcement Learning-based agent that manages multiple low-level car-following models according to the current state, either by selecting an appropriate low-level model to perform an action or by allocating different weights across all low-level components. Moreover, we propose a jerk-constrained kinematic model for more realistic car-following simulations. We evaluate the proposed method on real-world driving data from the HighD dataset. The experimental results show that EnsembleFollower reproduces human-like behavior more accurately and combines its constituent models effectively, demonstrating that the framework can handle diverse car-following conditions by leveraging the strengths of its low-level models.
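
The hierarchy can be sketched with two classical low-level models and a weighted blend. Here the Intelligent Driver Model and a simple gap-keeping controller stand in for the model pool, fixed weights stand in for the RL agent's output, and the jerk constraint is applied by rate-limiting the commanded acceleration; all parameter values are illustrative.

```python
import numpy as np

def idm_accel(gap, v, dv, v0=30.0, T=1.5, a_max=1.5, b=2.0, s0=2.0):
    """Intelligent Driver Model: one possible low-level model.
    dv is the closing speed (follower speed minus leader speed)."""
    s_star = s0 + v * T + v * dv / (2.0 * np.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** 4 - (s_star / gap) ** 2)

def gap_keep_accel(gap, v, dv, T=1.2, k=0.5, kd=0.8):
    """A simple gap-keeping controller: a second low-level model."""
    return k * (gap - v * T) - kd * dv

def follow_accel(gap, v, dv, weights, prev_a=0.0, dt=0.1, jerk_max=3.0):
    """High-level blend of the low-level models. A trained RL agent
    would output `weights` from the current state; the jerk constraint
    rate-limits the commanded acceleration between time steps."""
    a_cmd = (weights[0] * idm_accel(gap, v, dv)
             + weights[1] * gap_keep_accel(gap, v, dv))
    return float(np.clip(a_cmd, prev_a - jerk_max * dt, prev_a + jerk_max * dt))

print(follow_accel(gap=25.0, v=20.0, dv=1.0, weights=(0.7, 0.3)))
```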


Abstracting Causal Models

Beckers, Sander, Halpern, Joseph Y.

arXiv.org Artificial Intelligence

We consider a sequence of successively more restrictive definitions of abstraction for causal models. We start with a notion introduced by Rubenstein et al. (2017) called exact transformation, which applies to probabilistic causal models. We then move to uniform transformation, which applies to deterministic causal models and does not allow differences to be hidden by the "right" choice of distribution; to abstraction, where the interventions of interest are determined by the map from low-level states to high-level states; and finally to strong abstraction, which takes more seriously all potential interventions in a model, not just the allowed interventions. We show that procedures for combining micro-variables into macro-variables are instances of our notion of strong abstraction, as are all the examples considered by Rubenstein et al.
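
A minimal instance of the micro-to-macro case: two binary micro-variables are aggregated by OR into one macro-variable, and the abstraction check asks that intervening and then abstracting commutes with abstracting and then intervening, for every allowed intervention and every context. The variable names and the OR aggregation are assumptions for illustration.

```python
from itertools import product

def micro(u, do=None):
    """Micro model: A = u_a, B = u_b, C = A or B (unless intervened)."""
    do = do or {}
    a = do.get("A", u[0])
    b = do.get("B", u[1])
    c = do.get("C", int(a or b))
    return (a, b, c)

def macro(u, do=None):
    """Macro model: S aggregates (A, B) by OR; D mirrors C."""
    do = do or {}
    s = do.get("S", int(u[0] or u[1]))
    d = do.get("D", s)
    return (s, d)

# tau: the map from low-level states to high-level states.
tau = lambda st: (int(st[0] or st[1]), st[2])

# omega: the allowed micro interventions and their macro counterparts.
omega = [({}, {}),
         ({"A": 1, "B": 0}, {"S": 1}),
         ({"A": 0, "B": 0}, {"S": 0}),
         ({"C": 1}, {"D": 1})]

# Abstraction check: intervene-then-abstract must equal
# abstract-then-intervene for every context u and allowed intervention.
ok = all(tau(micro(u, lo)) == macro(u, hi)
         for u in product([0, 1], repeat=2)
         for lo, hi in omega)
print("abstraction holds:", ok)
```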